Bayesian active summarization
Abstract
Bayesian Active Learning has had a significant impact on various NLP problems, but its application to text summarization has been explored very little. We introduce Bayesian Active Summarization (BAS), a method that combines active learning methods with state-of-the-art summarization models. Our findings suggest that BAS achieves better and more robust performance than random selection, particularly for small data annotation budgets. More specifically, applying BAS with a summarization model like PEGASUS, we managed to reach 95% of the performance of the fully trained model using fewer than 150 training samples. Furthermore, we reduced the standard deviation by 18% compared to the conventional random selection strategy. Using BAS, we showcase that it is possible to leverage large summarization models to effectively solve real-world problems with limited annotated data.
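The abstract does not state how BAS scores uncertainty, so the following is only a rough illustration of uncertainty-based selection in general, not the authors' method. It assumes that several stochastic summaries per unlabeled document are already available (for example from repeated sampling with dropout enabled, which is an assumption here) and ranks documents by how much those summaries disagree. The word-overlap `disagreement` score, the function names, and the toy `pool` are all hypothetical.

```python
from itertools import combinations


def jaccard_distance(a: str, b: str) -> float:
    """Return 1 minus the Jaccard similarity of the lowercase word sets."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    union = set_a | set_b
    if not union:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


def disagreement(samples: list[str]) -> float:
    """Mean pairwise distance between stochastic summaries of one document.

    Hypothetical uncertainty proxy: higher means the sampled summaries
    differ more from one another.
    """
    pairs = list(combinations(samples, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


def select_for_annotation(pool: dict[str, list[str]], budget: int) -> list[str]:
    """Return ids of the `budget` documents with the highest disagreement."""
    ranked = sorted(pool, key=lambda doc_id: disagreement(pool[doc_id]), reverse=True)
    return ranked[:budget]


# Toy pool: document id -> summaries sampled from the same model (made up).
pool = {
    "doc_a": ["the cat sat", "the cat sat", "the cat sat"],
    "doc_b": ["stocks rose today", "markets fell sharply", "rates were unchanged"],
    "doc_c": ["rain is expected", "rain is likely", "rain is expected"],
}
selected = select_for_annotation(pool, budget=2)
print(selected)
```

In an active learning loop, the selected documents would be sent for annotation, added to the training set, and the model retrained before the next round. Random selection, the baseline named in the abstract, would instead draw `budget` ids uniformly from the pool.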
Similar resources
Bayesian Query-Focused Summarization
We present BAYESUM (for “Bayesian summarization”), a model for sentence extraction in query-focused summarization. BAYESUM leverages the common case in which multiple documents are relevant to a single query. Using these documents as reinforcement for query terms, BAYESUM is not afflicted by the paucity of information in short queries. We show that approximate inference in BAYESUM is possible o...
Bayesian Learning in Text Summarization
The paper presents a Bayesian model for text summarization, which explicitly encodes and exploits information on how human judgments are distributed over the text. Comparison is made against non-Bayesian summarizers, using test data from Japanese news texts. It is found that the Bayesian approach generally leverages performance of a summarizer, at times giving it a significant lead over nonBaye...
ASHRAM: Active Summarization and Markup
Typically, searching for information in a document collection amounts to refining a query and then scanning a large number of documents to determine their relevance. Active Summarization Having Related Active Markup (ASHRAM) is a facility for representing and automatically selecting, marking, and linking useful and/or salient items in a document, to make it easier for the user to determine the ...
Graph Hybrid Summarization
One approach to processing and analyzing massive graphs is summarization. Generating a high-quality summary is the main challenge of graph summarization. To generate a better-quality summary for a given attributed graph, both structural and attribute similarities must be considered. There are two measures named density and entropy to evaluate the quality of structural and at...
Bayesian Active Distance Metric Learning
Distance metric learning is an important component for many tasks, such as statistical classification and content-based image retrieval. Existing approaches for learning distance metrics from pairwise constraints typically suffer from two major problems. First, most algorithms only offer point estimation of the distance metric and can therefore be unreliable when the number of training examples...
Journal
Journal title: Computer Speech & Language
Year: 2024
ISSN: 1095-8363, 0885-2308
DOI: https://doi.org/10.1016/j.csl.2023.101553